;VWF code, Version Beta 1
; This is the ASM source code for the variable-width-font hack for Dragon Quest Monsters Caravan Heart
; Written by KaioShin
; If you want to contact me go through my website or the romhacking.net message board
; http://kaioshin.romhacking.net
; January 2007
;
; The source code contains some redundancies. I decided to let them be, since I was so happy it actually worked and
; I didn't want to introduce possible new bugs ;) So it's not the most optimized code, but I tried to make the comments
; as understandable as possible, I hope it's of use for you!
; The code is for the GBA obviously, but the concepts can be used on any system once you understand them.
; Enjoy and don't hesitate to leave feedback

; Start
; Parameter: r0 contains number (table value) of the current char

;Init
@Thumb
@org 0x0829E5CC                     ; Location of the routine

@define ram_base_data 0x03002830    ; At this location in RAM the game stores information about the text
@define dma3_base 0x040000D4        ; Source Address for DMA Channel 3
@define tile_RAM1 0x02005004        ; Location of Buffer Tile in RAM
@define tile_RAM2 0x02005060        ; Location of current new Tile in RAM
@define tile_RAM3 0x02005120        ; Location of completed vwf Tile which gets sent to VRAM
@define width_table 0x086C75F0      ; Location of the width table, 1 Byte each, from 1 - 8
@define width_RAM 0x2005000         ; Width of the buffer tile (spill over)
@define width_RAM2 0x2005001        ; Width of Current new Tile
@define font_location 0x086B5954    ; Location of 8x16 font
@define fill_mask 0x11111111        ; Mask for filling holes, see code for more details

    push {R4,R5,R14}        ; The original routine saves these as well, they are later used again
    lsl R0,R0,0x18
    lsr R0,R0,0x18          ; From original routine, seems to cut off values greater than 0xFF

; The game calls the DMA routine not only with the 8x16 font, but also with an 8x8 one and in other instances too
; This code ensures that we only do the vwf with the 8x16 font.
; All other cases are redirected to an unaltered copy
; of the original game's DMA routine. The code will be expanded later to cover the 8x8 font as well

    ldr R5,=font_location
    ldr R6,=ram_base_data
    ldr R6,[R6,0x4]         ; This RAM location holds a pointer to the current font
    cmp R5,R6               ; compare with the location of the 8x16 font
    bne ExcludeRoutine      ; If they don't match jump to the old routine

; First we get the width of the current character
    ldr R5,=width_table
    add R5,R0,R5            ; R0 holds the current char number (= table value)
    ldrb R5,[R5]
    ldr R1,=width_RAM2
    strb R5,[R1]            ; load the width and store it at the current tile's width location

    ldr R5,=width_RAM
    ldrb R5,[R5]            ; load the current buffer tile's width
    cmp R5, #0              ; We have to check if the buffer is currently empty or not
    bne WidthIsBiggerThan0

; The following code will run if there is currently no buffer in RAM. We will just copy a Tile from the Font into RAM
    ldr R5,=ram_base_data   ; Set pointer
    ldrb R1,[R5,0x10]       ; r1 = [3002840]
    lsl R1,R1,0x1F
    lsr R1,r1,0x1F
    add R1,0x1
    lsl R1,R1,0x15
    lsr R2,R1,0x10
    mul R0,R2               ; Calculates the tile offset for the font
    ldr R4,[R5,0x4]         ; Loads font pointer into r4
    add R4,R4,R0            ; Adds the offset to the font pointer, it now points
                            ; to the exact tile we'll use
    ldr R0,=tile_RAM1       ; Points to RAM location where we want the tile -> Buffer Tile
    ldr R3,=dma3_base
    str R4,[R3]             ; DMA Source Address = r4 = Tile address in font
    str R0,[R3,0x4]         ; DMA Destination Address = r0 = Buffer Tile
    lsr R1,R1,0x12          ; The next 4 lines build the control word for the DMA control
                            ; register
    mov R0,0x84
    lsl R0,R0,0x18
    orr R1,R0
    str R1,[R3,0x8]         ; store control word -> DMA start

; The current character's tile is our current Buffer Tile now
    ldr R5,=width_RAM2
    ldrb R5,[R5]
    ldr R4,=width_RAM
    strb R5,[R4]            ; We now save the current width as the buffer width

; Next we'll check if the Buffer is already full.
; Sometimes we fill an empty buffer right away with a big tile of width 8
; In this case we'll have to send it to VRAM and clear the buffer again
    cmp r5, 0x8
    bge NewTile8            ; Calls code when tile is already full
    b Exit                  ; If the tile isn't so big we'll exit. Nothing more to do now

ExcludeRoutine:             ; This is the original routine, it's called when
                            ; we don't work with the 8x16 font
    ldr r5,=0x3002830
    ldrb r1,[r5,0x10]
    lsl r1,r1,0x1F
    lsr r1,r1,0x1F
    add r1,0x1
    lsl r1,r1,0x15
    lsr r2,r1,0x10
    mul r0,r2
    ldr r4,[r5,0x4]
    add r4,r4,r0
    ldrb r0,[r5,0x16]
    mul r2,r0
    ldr r0,[r5,0x8]
    add r2,r2,r0
    ldr r3,=0x40000D4
    str r4,[r3]
    str r2,[r3,0x4]
    lsr r1,r1,0x12
    mov r0,0x84
    lsl r0,r0,0x18
    orr r1,r0
    str r1,[r3,0x8]
    ldr r0,[r3,0x8]
    ldrb r0,[r5,0x16]
    add r0,0x1
    ldrb r1,[r5,0x16]
    strb r0,[r5,0x16]
    b Exit

; The following code is called when we get a new buffer tile which is already full. We'll send it to the RAM location
; for complete tiles
NewTile8:
    mov R5,#0
    strb R5,[R4]            ; First we'll clear the buffer's width
    ldr R4,=tile_RAM1       ; Our source address - the Buffer Tile
    ldr R0,=tile_RAM3       ; Destination address - Complete Tile
    ldr R3,=dma3_base
    str R4,[R3]
    str R0,[R3,0x4]         ; Store the source / destination
    lsr R1,R1,0x12
    mov R0,0x84             ; Build Control word
    lsl R0,R0,0x18          ; "
    orr R1,R0               ; "
    str R1,[R3,0x8]         ; Store Control word - DMA start

    ldr R4,=width_RAM
    mov R3,#8
    strb R3,[R4]            ; This code is needed because of a redundancy.
                            ; The called routine will clear the buffer's width
                            ; although we already did it in this case
                            ; Well, I'm too lazy to fix this and it won't hurt, I told you this isn't
                            ; optimized :p
    b HigherThan8           ; Jump to the part which sends RAM3 -> VRAM

; Let's go back to the beginning - We just saw there is something in the Buffer Tile
; This means our current Tile will be copied into the Current Tile RAM position instead of the buffer position
WidthIsBiggerThan0:
    ldr R5,=ram_base_data   ; Set pointer
    ldrb R1,[R5,0x10]
    lsl R1,R1,0x1F
    lsr R1,r1,0x1F
    add R1,0x1
    lsl R1,R1,0x15
    lsr R2,R1,0x10
    mul R0,R2               ; Calculates the tile offset for the font
    ldr R4,[R5,0x4]         ; Loads font pointer into r4
    add R4,R4,R0            ; Adds the offset to the font pointer, now points
                            ; to the exact tile we'll use
    ldr R0,=tile_RAM2       ; points to RAM location where we want the tile -> Current Tile
    ldr R3,=dma3_base
    str R4,[R3]             ; DMA Source Address = r4 = Tile address in font
    str R0,[R3,0x4]         ; DMA Destination Address = r0 = tile_RAM2
    lsr R1,R1,0x12
    mov R0,0x84             ; Control word stuff
    lsl R0,R0,0x18
    orr R1,R0
    str R1,[R3,0x8]         ; store Control word -> DMA start
                            ; Tile should be in RAM now

; Now we get to the important part. We have a rest in the buffer and our new tile. We'll combine them now
; to build a new tile
;
; This is a VWF's heart, so I will show you the basics first
;
; A tile is made of several rows of pixels, in the case of an 8x16 font of 16 rows. Every row is made of 4 Bytes
; 4 Bits describe one Pixel. In case you didn't know this already, 4 bits are called a "nibble" in CS terminology.
; A GBA register holds 32 Bits, which means exactly one row of pixels. This is perfect as it makes the combining very easy.
; We'll load a row of both Tiles into separate registers now, the result will look like this:
;
; R0: 17111111 (Buffer)
; R1: 71771711 (Current Tile)
;
; We assume 1 means no pixel and 7 means there is a pixel.
; The width of the buffer is 4 and the width of the new tile is 6
; Now we'll have to create a new row of pixels which combines these two rows. The process is very very simple, you'll
; be surprised. Let's look at what the end result has to look like:
;
; R2: 17117177 - 17
; The part of the Current Tile has to be shifted behind the Buffer. You'll notice that the current tile doesn't completely
; fit into one row with the old one. The last two pixels (17) are left over. This is spill over.
; That's already all there is to a VWF. We'll now implement this in ASM. The spill over part will become our new Buffer for
; the next tile

; Combining starts here
    ldr R0,=tile_RAM1       ; Pointer to Buffer tile
    ldr R1,=tile_RAM2       ; Pointer to Current Tile
    ldr R2,=tile_RAM3       ; Pointer to Result Tile
    mov R6, #0              ; This is our loop counter
loop:                       ; We'll loop through this 16 times since we have 16 rows
    ldr R3,=width_RAM
    ldrb R3,[R3]            ; We load the Buffer width. This is the base by which we'll shift the
                            ; current tile
    mov R4, #4
    mov r7,r4
    mul r7,r6               ; One row is 4 bytes of data. This will create our offset for the pointer
                            ; to the current row. When we go through the loop a second time we want the
                            ; second 4 bytes - not the second byte. So we multiply the loop counter by 4
    ldr R4,[R1,R7]          ; Load Current Tile row
    ldr R5,[R0,R7]          ; Load Buffer Tile row
    push {R1,R3}            ; I ran out of registers here, oops :p So we push R1 and R3 for a moment
                            ; so we can use them without losing the values
    mov R1, #4
    mul R3, R1              ; We multiply the width by 4. The reason is that our width value represents
                            ; pixels. However we want to shift the nibbles (4 bits). Got it?
    lsl R4,R3               ; So we want to shift the pixels of the Current Tile to the right. Why shift
                            ; left here then?

; Short insert about GBA endianness:
; The GBA stores data in RAM with little endianness. This means they are stored in reverse byte order. FF000000 will be
; stored as 000000FF. Our font is stored in the correct order in RAM, the left pixels are the left nibbles.
; When we load the rows into registers it will reverse them! When we store them back they are back to normal order.
; This is why we have to shift the pixels to the left instead of to the right!
    pop {R1,r3}             ; Shift complete, get our old values back
    orr R4,R5               ; Now we combine the Buffer row and our shifted current row. Just or them
                            ; Yes, it's that easy ^_^
    str R4,[r2,R7]          ; We save the combined row into the result RAM location

; Done? Not quite, now we shouldn't forget about the spill over! We most likely shifted some parts of the Current Tile
; out of the row. We catch this now
    ldr R4,[R1,R7]          ; Reload current tile's row
    mov r5, 0x8             ; Time for some math. Say we have a buffer width of 5. We shifted the Current
                            ; Tile by that amount, so only 8 - buffer width = 3 pixels of it still fit
                            ; into the row. Everything beyond those 3 pixels was shifted out.
    sub r5,r5,r3            ; r5 contains 8 - buffer width now
    mov r3,0x4
    mul r3,r5               ; Time to get the spill over back. We multiply this value by 4
                            ; (nibbles vs pixels, remember?) and shift the row by this amount
    lsr r4,r3               ; in the opposite direction than before now.

; Quick example to show what's actually happening
; R0: 11223344 shift to the left by 5: 34400000
; The spill over we are looking for would be 11223
; 8-5 = 3
; R0: 11223344 shift to the right by 3: 00011223
; Magic!
    str R4,[R0,R7]          ; Overwrite Buffer Tile
    add R6,R6,#1            ; Increase loop counter
    cmp R6,0x10
    bne loop                ; continue for 16 rows of pixels (8x16)

    ldr R3,=width_RAM
    ldrb R3,[R3]
    ldr R2,=width_RAM2
    ldrb R2,[R2]            ; Now we calculate how big the two combined tiles would be
    add R3,R3,R2
    cmp r3, 0x8
    bge HigherThan8         ; We want to know if the width is 8 or more pixels, or rather if a new
                            ; tile is full or not.
                            ; If it's full we'll send the tile to VRAM
; If the tile isn't full we continue here
    ldr R2,=width_RAM
    strb r3,[r2]            ; We store the size of the new tile
; Now we'll copy the result (our new tile) to the Buffer Tile
    ldr R4,=tile_RAM3       ; Set pointer to combined tile
    ldr R0,=tile_RAM1       ; points to Buffer Tile
    ldr R3,=dma3_base
    str R4,[R3]             ; I think you know the rest now...
    str R0,[R3,0x4]
    lsr R1,R1,0x12
    mov R0,0x84
    lsl R0,R0,0x18
    orr R1,R0
    str R1,[R3,0x8]
    b Exit                  ; Done with this pass

HigherThan8:                ; if it's 8 or bigger the new tile is full and has to be sent to VRAM
; First off, calculate new width of buffer tile
    sub r3,0x8
    ldr R4,=width_RAM
    strb R3,[R4]            ; store width of new Buffer Tile
;send to VRAM now - you should really know this part by now :p
    ldr r5,=ram_base_data
    ldrb r1,[r5,0x10]
    lsl r1,r1,0x1F
    lsr r1,r1,0x1F
    add r1,0x1
    lsl r1,r1,0x15
    lsr r2,r1,0x10
    mul r0,r2
    ;ldr r4,[r5,0x4]
    ;add r4,r4,r0
    ldr r4,=tile_RAM3
    ldrb r0,[r5,0x16]
    mul r2,r0
    ldr r0,[r5,0x8]
    add r2,r2,r0
    ldr r3,=0x40000D4
    str r4,[r3]
    str r2,[r3,0x4]
    lsr r1,r1,0x12
    mov r0,0x84
    lsl r0,r0,0x18
    orr r1,r0
    str r1,[r3,0x8]
    ldr r0,[r3,0x8]
    ldrb r0,[r5,0x16]
    add r0,0x1
    ldrb r1,[r5,0x16]
    strb r0,[r5,0x16]

Exit:                       ; Exit subroutine
    pop {r4,r5}
    pop {r0}
    bx r0
@pool                       ; explicitly tell the assembler to put the literal pool here.

; End of vwf main routine. Done? Not quite, there is one thing. We'll have to modify the end of line control code routine
; Otherwise, when a new line is reached, every potential spill over will be printed at the beginning of the new line instead
; of the end of the last line. So line one will end with half a g and line two will start with half a g. We don't want that,
; do we?
@org 0x0828647C
    push {r0,r1,r5}
    ldr R5,=tile_RAM1
    mov R4,0x0
    ldr R2,=fill_mask
loop2:
    ldr R3,[R5,R4]
    orr R3,R2
    str R3,[R5,R4]
    add R4,0x4
    cmp R4,0x40
    bne loop2
; This code will OR our Buffer with 0x11111111. The reason for this is that the spill over might contain zeroes.
; Remember, when we shifted the spill over, the nibbles which are shifted in from the outside are all 0. (We had 00011223
; in the example). A value of zero means that the pixel is not black, but transparent. This means the zeroes will produce
; "holes" in the message windows, through which you can see the tiles behind them. By ORing the rows against ones we paint
; those holes black

; Next comes copying the Buffer Tile into VRAM - I won't copy and paste this part again. You should know it by now.
; Well, I should have made that a callable subroutine, I said the code is redundant :p
; ... copy to vram bla ...
    mov r1,0x0
    ldr r0,=width_RAM
    strb r1,[r0]            ; with this we set the buffer's width back to zero, after all we just printed it.
    pop {R0,R1,R5}
    bx lr                   ; Done!

; That's it. I hope you found the source code and my explanations somewhat readable and informative.
; Copyright 2007 KaioShin
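
; Appendix sketch (not part of the original hack): a small Python model of the combining loop, for readers who want
; to try the shifts without a GBA at hand. It is only an illustration under a few assumptions: the name combine() is
; made up for this sketch, empty pixels are 0 nibbles, the buffer width is between 1 and 7 (the routine only combines
; when the buffer is neither empty nor full), and the leftmost pixel of a row sits in the lowest nibble of the loaded
; word, which is why shifting left in the register moves pixels to the right on screen.

```python
ROWS = 16             # an 8x16 glyph has 16 rows of 8 pixels
MASK32 = 0xFFFFFFFF   # emulate a 32-bit register


def combine(buffer_rows, buffer_width, cur_rows, cur_width):
    """Hypothetical model of the combining loop.

    Returns (full, out_rows, new_buffer_rows, new_buffer_width).
    Each row is one 32-bit word, 4 bits per pixel.
    """
    assert 1 <= buffer_width <= 7 and 1 <= cur_width <= 8
    shift = 4 * buffer_width        # pixels -> nibbles, like "mul R3,R1"
    back = 4 * (8 - buffer_width)   # shift that brings the spill over back

    # Combined row: buffer pixels first, current tile shifted behind them.
    out_rows = [(b | (c << shift)) & MASK32
                for b, c in zip(buffer_rows, cur_rows)]
    # Spill over: the part of the current tile that did not fit.
    spill_rows = [c >> back for c in cur_rows]

    total = buffer_width + cur_width
    if total >= 8:
        # Tile is full: it goes to VRAM, the spill over is the new buffer.
        return True, out_rows, spill_rows, total - 8
    # Tile is not full yet: the combined tile becomes the new buffer.
    return False, list(out_rows), list(out_rows), total


if __name__ == "__main__":
    # Buffer holds pixels 1,2,3 (width 3); current tile holds A..F (width 6).
    full, out, buf, width = combine([0x00000321] * ROWS, 3,
                                    [0x00FEDCBA] * ROWS, 6)
    print(full, hex(out[0]), hex(buf[0]), width)
    # prints: True 0xedcba321 0xf 1
```

; In the example the first five pixels of the current tile fill the row, and the sixth pixel (F) is the spill over
; that becomes the new buffer with width 1.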